Add per-attempt LLM spans under call-level retry (0050) by chris-colinsky · Pull Request #170 · LunarCommand/openarmature-python

chris-colinsky · 2026-06-19T15:44:11Z

Summary

Completes proposal 0050 by implementing the call-level-retry per-attempt LLM span surface (observability §5.5 / llm-provider §7.1). 0050 shipped partial in v0.14.0 (failure-isolation middleware and the complete(retry=...) loop); this branch lands the deferred piece: under call-level retry, the OTel observer now emits one openarmature.llm.complete span per attempt rather than one per call.

What changed

New per-attempt event. A python-internal LlmRetryAttemptEvent (frozen, exported from openarmature.graph) is dispatched once per in-call attempt, carrying that attempt's identity / scoping, request-side fields, and outcome. The error_category field discriminates the outcome: None means the attempt succeeded, any other value means it failed.
Provider emit. OpenAIProvider.complete() dispatches one LlmRetryAttemptEvent per attempt, including the single attempt of a no-retry call (at index 0), with per-attempt latency that excludes backoff. The terminal LlmCompletionEvent / LlmFailedEvent are unchanged: still exactly one per call.
Observer render. The OTel observer renders the openarmature.llm.complete span(s) solely from LlmRetryAttemptEvent, each tagged openarmature.llm.attempt_index (0..N-1). A failed intermediate attempt carries ERROR plus the §4 category plus the request-side attributes; the final (or single) attempt carries the full §5.5 response surface. The two terminal events no longer drive the OTel span; they stay on the queue for the Langfuse mapping and payload / latency consumers. This collapses the previous completion and failed handlers into one.
Langfuse unchanged. The Langfuse observer ignores LlmRetryAttemptEvent and renders one terminal Generation per call from the terminal events.
Manifest, docs, changelog. conformance.toml flips 0050 to implemented (since 0.15.0); the observability concepts page documents the attribute, the per-attempt span behavior, and the enricher / consuming-event implications; a 0.15.0 changelog section lands, also backfilling the 0061 detached-trace span entry.
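For readers who want the emit shape in concrete terms, here is a minimal sketch of the pattern described above: one frozen event per attempt, with latency measured before any backoff sleep. Everything in it is illustrative. AttemptEvent, TransientError, and complete_with_retry are hypothetical stand-ins, not the real LlmRetryAttemptEvent or OpenAIProvider.complete(), and the field names are assumptions.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class AttemptEvent:
    """Hypothetical stand-in for LlmRetryAttemptEvent; field names are assumed."""

    attempt_index: int
    latency_s: float
    error_category: Optional[str] = None  # None means the attempt succeeded


class TransientError(Exception):
    """Hypothetical retryable failure."""


def complete_with_retry(
    call: Callable[[], str],
    emit: Callable[[AttemptEvent], None],
    max_attempts: int = 3,
    backoff_s: float = 0.0,
) -> str:
    """Run call() up to max_attempts times, emitting one event per attempt."""
    for index in range(max_attempts):
        start = time.perf_counter()
        try:
            result = call()
        except TransientError:
            # Latency is taken before sleeping, so backoff is excluded.
            emit(AttemptEvent(index, time.perf_counter() - start, "transient"))
            if index == max_attempts - 1:
                raise  # retries exhausted
            time.sleep(backoff_s)
            continue
        # A no-retry call takes this path once, at index 0.
        emit(AttemptEvent(index, time.perf_counter() - start))
        return result
    raise ValueError("max_attempts must be at least 1")
```

A call that fails twice and then succeeds yields three events with indices 0, 1, 2, of which only the last has error_category set to None.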

Design notes

LlmRetryAttemptEvent is python-internal, not a spec-normative event type. The per-attempt span contract is the already-accepted observability §5.5 (one span per attempt, openarmature.llm.attempt_index 0..N-1); §5.5 does not pin which internal event the observer renders from, so making this event the sole span source is an implementation choice. The guardrail: each per-attempt span carries the full §5.5 attribute surface, verified by fixtures 057 and 016-021 / 040-042 staying green. For Langfuse, terminal-Generation-per-call is the intended shape; §8 is currently silent on call-level retry, and a spec-side clarification to pin it is tracked (non-blocking).
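The sole-span-source choice can be sketched as follows. This is a dependency-free illustration, not the actual OTel observer: it records plain dicts rather than real spans, AttemptEvent and TerminalEvent are hypothetical stand-ins, and both the "error_category" attribute key and the UNSET success status are assumptions (the PR only states that failed attempts carry ERROR).

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class AttemptEvent:
    """Hypothetical stand-in for LlmRetryAttemptEvent."""

    attempt_index: int
    error_category: Optional[str] = None


@dataclass(frozen=True)
class TerminalEvent:
    """Hypothetical stand-in for LlmCompletionEvent / LlmFailedEvent."""

    ok: bool


class SpanRecordingObserver:
    """Records span-like dicts; a real observer would open OTel spans."""

    def __init__(self) -> None:
        self.spans: List[Dict[str, Any]] = []

    def on_event(self, event: object) -> None:
        if not isinstance(event, AttemptEvent):
            return  # terminal events never produce a span
        attrs: Dict[str, Any] = {
            "openarmature.llm.attempt_index": event.attempt_index,
        }
        status = "UNSET"  # assumed status for a successful attempt
        if event.error_category is not None:
            status = "ERROR"
            attrs["error_category"] = event.error_category  # key name assumed
        self.spans.append(
            {"name": "openarmature.llm.complete", "status": status, "attributes": attrs}
        )
```

Feeding one failed attempt, one successful attempt, and a terminal event produces exactly two span records, which is the same invariant the regression test in this PR guards.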

Testing

Spec conformance fixtures 056-058 (transient-then-success, exhaustion, non-transient) driven through the provider plus OTel observer; obs-057 (single-attempt) wired in the conformance harness.
New regression test asserting terminal events produce zero OTel spans.
Full suite green (1326 passed); pyright and mkdocs clean.

Notes

The 0.15.0 changelog date is tentative pending the release tag.
A separate follow-up will normalize conformance.toml's proposal note style; not in this PR.

Flip conformance.toml [proposals."0050"] partial -> implemented (since 0.15.0): the call-level-retry per-attempt span surface now ships. Document the openarmature.llm.attempt_index attribute and the per-attempt span behavior in the observability concepts page, plus notes that span enrichers receive LlmRetryAttemptEvent on the LLM span and that the bundled provider dispatches that internal event alongside the unchanged terminal events. Add the 0.15.0 changelog section covering this work and backfilling the 0061 detached-trace invocation span (which landed without an entry), plus the v0.60.0 -> v0.61.0 spec-pin bullet.

_build_llm_retry_attempt_event constructed a full LlmRetryAttemptEvent twice, repeating ~18 shared identity, scoping, and request-side fields across the success and failure branches. Hoist them into one base dict and splat it, leaving each branch to add only its outcome fields. No behavior change.
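The hoist-and-splat refactor looks roughly like this. The sketch uses a three-field hypothetical event rather than the ~18 real shared fields, and build_attempt_event is an illustrative name, not the actual _build_llm_retry_attempt_event signature.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class AttemptEvent:
    """Hypothetical, reduced stand-in for LlmRetryAttemptEvent."""

    call_id: str
    model: str
    attempt_index: int
    response_text: Optional[str] = None
    error_category: Optional[str] = None


def build_attempt_event(
    call_id: str,
    model: str,
    attempt_index: int,
    *,
    response_text: Optional[str] = None,
    error_category: Optional[str] = None,
) -> AttemptEvent:
    # Shared identity / request-side fields are assembled once.
    base: Dict[str, Any] = {
        "call_id": call_id,
        "model": model,
        "attempt_index": attempt_index,
    }
    # Each branch adds only its outcome fields on top of the shared base.
    if error_category is not None:
        return AttemptEvent(**base, error_category=error_category)
    return AttemptEvent(**base, response_text=response_text)
```

Because both branches splat the same dict, adding a shared field later means touching one place instead of two.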

The OTel observer now renders the LLM span solely from the per-attempt LlmRetryAttemptEvent; terminal LlmCompletionEvent / LlmFailedEvent are ignored. Add a regression test feeding both terminal events and asserting zero openarmature.llm.complete spans, guarding against reintroducing the terminal-event span path. Also fix a stale docstring in _drive_llm_span_with_cached_tokens that still referenced "typed LlmCompletionEvent".

Copilot

Pull request overview

Implements proposal 0050’s observability §5.5 “per-attempt LLM spans” by introducing a new per-attempt internal event type and switching the OTel observer to render openarmature.llm.complete spans exclusively from that per-attempt event (one span per call-level retry attempt), while keeping the terminal LlmCompletionEvent / LlmFailedEvent as one-per-call for non-OTel consumers.

Changes:

Add LlmRetryAttemptEvent and dispatch it once per in-call attempt from OpenAIProvider.complete() (attempt latency excludes backoff).
Update OTel observer to create per-attempt openarmature.llm.complete spans from LlmRetryAttemptEvent and ignore terminal LLM events for span rendering; Langfuse ignores the new per-attempt event.
Update tests, conformance harness behavior, docs, conformance manifest, and changelog to reflect the per-attempt span contract and proposal 0050 being fully implemented.

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 2 comments.

Show a summary per file

| File | Description |
| --- | --- |
| tests/unit/test_observability_otel.py | Updates unit tests to drive OTel spans from per-attempt events; adds regression coverage for ignoring terminal events; adds fixture-driven per-attempt span assertions. |
| tests/unit/test_llm_provider.py | Updates provider emission-shape test to expect per-attempt event followed by terminal event on success. |
| tests/conformance/test_observability.py | Wires new observability fixture; excludes per-attempt internal event from conformance collector stream. |
| tests/conformance/test_llm_provider.py | Notes that call-level retry fixtures’ per-attempt spans are asserted in OTel unit tests rather than the provider conformance harness. |
| tests/_helpers/typed_event.py | Adds helper for constructing `LlmRetryAttemptEvent` for tests. |
| src/openarmature/observability/otel/observer.py | Renders `openarmature.llm.complete` spans from `LlmRetryAttemptEvent` (one per attempt), and ignores terminal LLM events for span creation. |
| src/openarmature/observability/langfuse/observer.py | Explicitly ignores `LlmRetryAttemptEvent` to keep one Generation per call from terminal events. |
| src/openarmature/observability/correlation.py | Extends dispatch/observer event unions to include `LlmRetryAttemptEvent`. |
| src/openarmature/llm/providers/openai.py | Emits per-attempt events within the call-level retry loop via a callback, keeping terminal event behavior unchanged. |
| src/openarmature/graph/observer.py | Extends `ObserverEvent` union and docs to include the per-attempt internal event. |
| src/openarmature/graph/events.py | Adds the `LlmRetryAttemptEvent` dataclass and exports it from `openarmature.graph.events`. |
| docs/concepts/observability.md | Documents per-attempt spans, `openarmature.llm.attempt_index`, and enricher/consumer implications. |
| conformance.toml | Marks proposal 0050 as `implemented` since 0.15.0 with updated narrative. |
| CHANGELOG.md | Adds 0.15.0 entries describing per-attempt spans and detached-trace invocation span; records spec pin advance. |

PR #170 Copilot review:
- Re-export LlmRetryAttemptEvent from the openarmature.graph package (import block + __all__), matching the sibling LlmCompletionEvent / LlmFailedEvent so the documented observer import path works.
- Replace the brittle type(event).__name__ name match with an isinstance check in the conformance _TypedEventCollector; the filter_event_type string comparison stays as-is.
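One way a class-name comparison can be brittle, compared with isinstance, is shown below. The classes and the Collector are hypothetical stand-ins for the conformance _TypedEventCollector and the real event types; the subclass case is an illustrative assumption, not something the review comment spells out.

```python
from typing import List


class AttemptEvent:
    """Hypothetical stand-in for LlmRetryAttemptEvent."""


class TracedAttemptEvent(AttemptEvent):
    """Hypothetical subclass; its __name__ differs from the parent's."""


class CompletionEvent:
    """Hypothetical stand-in for a terminal event."""


def name_match_excludes(event: object) -> bool:
    # String comparison: silently misses subclasses and renamed classes.
    return type(event).__name__ == "AttemptEvent"


class Collector:
    """Collects every event except per-attempt internal events."""

    def __init__(self) -> None:
        self.events: List[object] = []

    def on_event(self, event: object) -> None:
        if isinstance(event, AttemptEvent):
            return  # excludes AttemptEvent and any subclass
        self.events.append(event)
```

With the name match, a TracedAttemptEvent would leak into the collected stream; the isinstance check filters it out and also survives a rename of the class.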

chris-colinsky added 9 commits June 18, 2026 18:04

Add LlmRetryAttemptEvent for per-attempt LLM spans (0050)

9a7a4b0

Emit per-attempt LlmRetryAttemptEvent from complete() (0050)

2fb069f

Render per-attempt LLM spans from LlmRetryAttemptEvent (0050)

e6f6285

Activate obs-057 (single-attempt llm.attempt_index) (0050)

beea3ab

Add N-span call-level-retry integration test (0050)

372f0e6

Activate call-level-retry per-attempt span fixtures (0050)

f20f429

Copilot AI review requested due to automatic review settings June 19, 2026 15:44

Copilot started reviewing on behalf of chris-colinsky June 19, 2026 15:44

Copilot AI reviewed Jun 19, 2026

Comment thread src/openarmature/graph/events.py

Comment thread tests/conformance/test_observability.py

chris-colinsky merged commit 7224e30 into main Jun 19, 2026
6 checks passed

chris-colinsky deleted the feature/0050-per-attempt-llm-spans branch June 19, 2026 16:21

chris-colinsky merged 10 commits into main from feature/0050-per-attempt-llm-spans


2 participants
